Autoencoders, Minimum Description Length and Helmholtz Free Energy
Authors
Abstract
An autoencoder network uses a set of recognition weights to convert an input vector into a code vector. It then uses a set of generative weights to convert the code vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the code vector and the reconstruction error. We show that this information is minimized by choosing code vectors stochastically according to a Boltzmann distribution, where the generative weights define the energy of each possible code vector given the input vector. Unfortunately, if the code vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible code vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution and that this approximation gives an upper bound on the description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn factorial codes.
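The bound described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a tiny set of three possible code vectors with hypothetical energies (in nats), a temperature of 1, and the standard free-energy form "expected energy minus entropy". The function names and the numbers are illustrative assumptions. It checks that the Boltzmann distribution attains the minimum description length, and that another distribution over codes (standing in for the one a recognition network might produce) gives a value that is never lower.

```python
import math

def boltzmann(energies):
    """Boltzmann distribution p_i proportional to exp(-E_i), temperature 1."""
    m = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - m)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def free_energy(q, energies):
    """Expected energy minus entropy of q: sum q_i E_i + sum q_i log q_i."""
    expected_energy = sum(qi * ei for qi, ei in zip(q, energies))
    neg_entropy = sum(qi * math.log(qi) for qi in q if qi > 0.0)
    return expected_energy + neg_entropy

def min_description_length(energies):
    """-log sum_i exp(-E_i), the free energy at the Boltzmann distribution."""
    m = min(energies)
    return m - math.log(sum(math.exp(-(e - m)) for e in energies))

# Hypothetical energies for three code vectors (assumed values, in nats).
energies = [1.0, 2.0, 4.0]

p = boltzmann(energies)        # exact distribution over codes
q = [1 / 3, 1 / 3, 1 / 3]      # stand-in for an approximate distribution

f_opt = free_energy(p, energies)
f_approx = free_energy(q, energies)

print(f"Boltzmann free energy:   {f_opt:.4f}")
print(f"Approximate free energy: {f_approx:.4f}")
print(f"Gap (bound slack):       {f_approx - f_opt:.4f}")
```

With distributed codes the sum over all code vectors grows exponentially, which is why the exact quantity is replaced by a bound computed from a tractable approximate distribution.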
Related resources
Contributions to the Theory of Thermostated Systems II: Least Dissipation of Helmholtz Free Energy in Nano-Biology
In this paper, we develop further the theory of thermostated systems along the lines of our earlier paper ([1] Fox). Two results are highlighted: 1) in the Markov limit of the contracted description, a least dissipation of Helmholtz free energy principle is established; and 2) a detailed account of the appropriateness of this principle for nanobiology, including the evolution of life, is presen...
An Approach to Selecting Putative RNA Motifs Using MDL Principle
The history of molecular biology is punctuated by a series of discoveries demonstrating the surprising breadth of biological roles of ribonucleic acid (RNA). An ensemble of evolutionary related RNA sequences believed to contain signals at sequence and structure level can be exploited to detect motifs common to all or a portion of those sequences. Finding these similar structural features can pr...
Thomas-Fermi Calculations for Determining the Critical Properties of Symmetric Nuclear Matter Based on the Generalized Effective Mass Approach
Using the mean-field and semi-classical Thomas-Fermi approximation within a statistical model, the equation of state and critical properties of symmetric nuclear matter are studied. In this model, the phenomenological two-body interaction of Myers and Swiatecki is used in phase space. By performing a functional variation of the total Helmholtz free energy of the system with respect to the nucleonic di...
Sound Wave Propagation in Viscous Liquid-Filled Non-Rigid Carbon Nanotube with Finite Length
In this paper, numerical results are obtained and explained from an exact formula for the sound pressure load due to the presence of liquid inside finite-length non-rigid carbon nanotubes (CNTs), which is coupled with the dynamic equations of motion for the CNT. To demonstrate the accuracy of this work, the obtained formula has been compared to what has been used by other research...
23. Bayesian Ying Yang Learning (II): A New Mechanism for Model Selection and Regularization
Efforts toward a key challenge of statistical learning, namely making learning on a finite size of samples with model selection ability, have been discussed in two typical streams. Bayesian Ying Yang (BYY) harmony learning provides a promising tool for solving this key challenge, with new mechanisms for model selection and regularization. Moreover, not only the BYY harmony learning is further j...